A generalized Markov chain representation of fault dynamics is presented for the case that available modeling
of fault growth physics and future environmental stresses can be represented by two independent stochastic
process models. A contrived but representatively challenging example is presented and analyzed, in which
uncertainty in the modeling of fault growth physics is represented by a uniformly distributed dice throwing
process, and a discrete random walk is used to represent uncertain modeling of future exogenous loading
demands to be placed on the system. A finite horizon dynamic programming algorithm is used to solve for an
optimal control policy over a finite time window for the case that stochastic models representing physics of
failure and future environmental stresses are known, and the states of both stochastic processes are observable
by implemented control routines. The fundamental limitations of optimization performed in the presence of
uncertain modeling information are examined by comparing the outcomes obtained from simulations of an
optimizing control policy with the outcomes that would be achievable if all modeling uncertainties were removed
from the system.


I. Introduction 

Predicting the future dynamics of component fault modes is typically complicated by a lack of comprehensive run 
to failure data-sets that could be used to develop and validate prognostic models, and a limited knowledge of the 
future stresses to be placed on a system’s components. However, even highly uncertain prognostic predictions may 
still provide invaluable information that can be utilized to improve the operation of safety critical and expensive to 
maintain systems such as aircraft and spacecraft. 

Two types of modeling uncertainty affecting prognostic predictions are considered in this publication: uncertainty
incorporated into physics of failure models, and uncertainty in the estimation of future exogenous stresses on a
system. Uncertainty in estimating the dynamics of component degradation as a function of the loads or stresses
applied to a specimen is often included as a noise term in lumped parameter and data driven fault growth
prediction models [1-2]. A bounded environmental disturbance term is typically used to incorporate uncertain
modeling of future exogenous system stresses into the analysis of robust and reconfigurable control design
techniques, such as $H_\infty$ control, $L_1$ control, and gain scheduling [3, 4]. This paper is motivated by a
desire to examine an analytical formulation of the prognostics-based decision making problem, in which models of
fault growth physics and environmental stresses are represented by two independent stochastic processes. An
invented example of a dice throwing game is described to illustrate some fundamental challenges associated with
assessing and managing the prognostic uncertainty introduced by incorporating stochastic modeling of physics of
failure and future environmental stresses. Section II of the paper introduces a generalized representation of
component degradation dynamics based on the modeled dynamics of future environmental loading conditions, the
estimated statistics of a process noise term, and the actions taken by onboard control routines in response to
observations of environmental and component health states. Section III describes a contrived but challenging
example of a system in which uncertainty in the modeling of fault growth physics is represented by a uniformly
distributed dice throwing process, and uncertain modeling of future environmental loading demands on the system
is represented by a discrete random walk. The use of finite horizon dynamic programming to identify control
policies that optimize the expectation of a given cost function over a finite time window is discussed in
Section IV. Simulation results for selected optimized control policies on the dice throwing example are presented
and analyzed in Section V.

II. Stochastic Modeling of Fault Growth Dynamics 

Consider a Markov chain representation of component damage accumulation that is expressed in terms of a metric 
representing the load applied to a specimen at each control time-step, 

$$p_{i,j}(u) = \Pr\left(\gamma(k+1) = s_j \mid \gamma(k) = s_i,\ u(k) = u\right), \quad \gamma, s_i, s_j \in S,\ u \in U(k),\ k \in \mathbb{N} \tag{1}$$

$$\sum_{s_j \in S} p_{i,j}(u) = 1 \tag{2}$$

where $U(k)$ represents a domain of feasible loads that may be applied at a given time-instant, $\gamma$ is a random
variable representing a component's state of health (SOH), $S$ represents a quantized state space for component
SOH, and $p_{i,j}(u)$ represents the probability of transitioning from SOH $s_i$ to SOH $s_j$ given an applied
load $u$. The component loading variable may represent pressure, force, torque, or a wide variety of other
stressors that drive component damage. Eq. (2) specifies that the sum of all transition probabilities defined at
each system state in the Markov chain representation of the damage accumulation process must always equal one.
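
As a small illustration of Eqs. (1) and (2), the sketch below stores one transition matrix per applied load and checks that every row sums to one. The three SOH levels, the two load labels, and all probability values are made-up assumptions chosen only for illustration; they do not come from the example analyzed in this paper.

```python
import math

# Hypothetical three-level SOH space: index 0 = healthy, index 2 = failed (absorbing).
# The load labels and probabilities are illustrative assumptions only.
P = {
    "low":  [[0.9, 0.1, 0.0],
             [0.0, 0.8, 0.2],
             [0.0, 0.0, 1.0]],
    "high": [[0.6, 0.3, 0.1],
             [0.0, 0.5, 0.5],
             [0.0, 0.0, 1.0]],
}

def is_row_stochastic(matrix, tol=1e-12):
    """Check Eq. (2): every row is non-negative and sums to one."""
    return all(
        min(row) >= 0.0 and math.isclose(sum(row), 1.0, abs_tol=tol)
        for row in matrix
    )

def step_distribution(dist, matrix):
    """Propagate a distribution over SOH states through one control time-step."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]

# Start healthy and apply an assumed load sequence.
dist = [1.0, 0.0, 0.0]
for load in ["low", "high", "high"]:
    dist = step_distribution(dist, P[load])
```

Because each matrix satisfies Eq. (2), the propagated distribution remains a valid probability distribution regardless of the load sequence that is applied.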

Let the component degradation dynamics be defined by the following generic nonlinear mapping: 

$$\gamma(k+1) = f\left(\gamma(k),\ u(k),\ \xi(k)\right) \tag{3}$$

where $\xi(k)$ is a random variable representing uncertainty in physics of failure models. In this case, knowledge
of the mapping function $f$ and the statistics of $\xi$ could be used to evaluate the transition probability matrix
defined in Eq. (1). The component loading profile $u(k)$ is assumed to be dictated partly by the dynamics of the
system's operating environment, which may not be entirely predictable, and partly by the control actions taken in
response to online observations of environmental states and the component SOH. The relationship between
environmental conditions, control actions selected by onboard control routines, and component loading is
considered to be generically represented as:

$$u(k) = \rho(k) \cdot w(k) \tag{4}$$

where $w$ is taken to represent the net load that would be exerted by a nominally controlled component at a given
time-instant, and $\rho$ is taken to represent a relative deviation from nominal component loading that is induced
by the actions of an implemented prognostics-based control routine. The response of a nominal control law to
variations in environmental conditions is assumed to be known, thus a stochastic model of $w$ can be directly
derived from a stochastic model of environmental conditions. A formal description of potential challenges and
benefits associated with this definition of the action space for onboard control routines is given in ref. [5].

Substitution of Eq. (4) into Eq. (3) yields: 

$$\gamma(k+1) = f\left(\gamma(k),\ \rho(k) \cdot w(k),\ \xi(k)\right) \tag{5}$$

Component SOH dynamics are now defined in terms of the control actions taken at each control time-instant and
the joint distribution of uncertain models for environmental conditions and fault growth physics. Given that the
two sources of uncertainty can generally be assumed to be independent of each other, Eq. (1) can be expressed in
terms of the control variable $\rho$ as follows:

$$p_{i,j}(\rho) = \sum_{w \in W} \sum_{\xi \in \Xi} \Pr\left(w(k) = w\right) \cdot \Pr\left(\xi(k) = \xi\right) \quad \text{where } f\left(s_i,\ \rho \cdot w,\ \xi\right) = s_j \tag{6}$$

where $W$ and $\Xi$ represent quantized state spaces for $w$ and $\xi$ respectively. The random variable $\xi$ is
considered to be represented by a stationary model of process noise, and a potentially nonstationary distribution
is considered for representing exogenous inputs to the system. A process model for $w$ is generically represented
using the notation of conditional probabilities:

$$\Pr\left(w(k) = w\right) = \sum_{v \in W} \Pr\left(w(k) = w \mid w(k-1) = v\right) \cdot \Pr\left(w(k-1) = v\right), \quad \forall w \in W,\ k \in [a+1, \infty) \tag{7}$$

where a possibly uncertain observation of $w$ at time-index $a$ is used to initialize the random process.

The two dimensional state transition matrix given in Eq. (1) can now be expressed as a four dimensional matrix
that incorporates the process models for $\xi$ and $w$.

$$P_{(i,j),(l,m)}(\rho) = \sum_{\xi \in \Xi} \Pr\left(w(k) = v_l \mid w(k-1) = v_m\right) \cdot \Pr\left(\xi(k) = \xi\right) \quad \text{where } f\left(s_i,\ \rho \cdot v_m,\ \xi\right) = s_j, \quad s_i, s_j \in S,\ v_l, v_m \in W \tag{8}$$

where $P_{(i,j),(l,m)}(\rho)$ represents the probability of transitioning from SOH $s_i$ and exogenous loading
demand $v_m$ to SOH $s_j$ and exogenous loading demand $v_l$. This notation enables available knowledge of system
kinematics, fault growth modeling, exogenous demand modeling, measurement uncertainties, and modeling
uncertainties to be represented by a finite set of state transition probabilities. The state transition
probabilities may be chosen to approximate some analytical formulation for the stochastic process, or they may be
identified purely by fitting a system model to a history of observations of the fault growth process, as is the
case with hidden Markov model learning techniques [6-7].
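
The construction in Eq. (8) can be sketched in a few lines of code. The example below is only a minimal illustration under stated assumptions: the grids `S` and `W`, the bounded random walk kernel `w_kernel`, the two-point noise distribution `xi_pmf`, the damage map `f`, and the nearest-neighbor snapping of SOH onto the grid (`nearest_index`) are all hypothetical choices made here, not quantities taken from the paper.

```python
from itertools import product

def nearest_index(grid, value):
    """Snap a continuous value onto the quantized grid (an assumed quantization rule)."""
    return min(range(len(grid)), key=lambda i: abs(grid[i] - value))

def joint_transition(S, W, w_kernel, xi_pmf, f, rho):
    """Build P[(i, m)][(j, l)]: probability of moving from (s_i, v_m) to (s_j, v_l).

    w_kernel[m][l] plays the role of Pr(w(k) = v_l | w(k-1) = v_m) in Eq. (8),
    and xi_pmf[xi] plays the role of Pr(xi(k) = xi).
    """
    P = {}
    for i, m in product(range(len(S)), range(len(W))):
        row = {}
        for xi, p_xi in xi_pmf.items():
            j = nearest_index(S, f(S[i], rho * W[m], xi))
            for l, p_w in enumerate(w_kernel[m]):
                if p_w > 0.0:
                    row[(j, l)] = row.get((j, l), 0.0) + p_w * p_xi
        P[(i, m)] = row
    return P

# Hypothetical quantized spaces and models, for illustration only.
S = [0.0, 0.25, 0.5, 0.75, 1.0]
W = [0, 1, 2]
w_kernel = [[2 / 3, 1 / 3, 0.0], [1 / 3, 1 / 3, 1 / 3], [0.0, 1 / 3, 2 / 3]]
xi_pmf = {0.5: 0.5, 1.5: 0.5}

def f(gamma, u, xi):
    """Assumed damage map: health lost in proportion to |load| times noise."""
    return min(1.0, max(0.0, gamma - 0.25 * abs(u) * xi))

P = joint_transition(S, W, w_kernel, xi_pmf, f, rho=1.0)
```

Storing each row as a sparse dictionary keeps the four dimensional structure explicit while avoiding the storage of the many zero entries that arise when only a few successor states are reachable.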


III. A Dice Throwing Analogue to the General Prognostics-Based Control Problem 

This section considers an invented dice throwing game that is intended to illustrate some of the fundamental difficulties 
in assessing and managing the modeling uncertainty introduced by incorporating stochastic models of fault growth 
physics and environmental loading demands. 

Consider a component degradation model of the form: 

$$\gamma(k+1) = \gamma(k) - A \cdot |u(k)| \cdot \xi(k) \tag{9}$$

where the rate of component health lost is defined to be proportional to the magnitude of component load $u$
multiplied by a process noise variable $\xi$, and $A$ represents a constant of proportionality.

Substituting Eq. (4) into Eq. (9), with $\rho$ taken to be positive, yields:

$$\gamma(k+1) = \gamma(k) - A \cdot \rho(k) \cdot |w(k)| \cdot \xi(k) \tag{10}$$

where $\rho$ represents a degree of freedom that control routines may use to regulate component loading at the
expense of degrading a system's nominal performance, and $w$ represents exogenous loading demands.

Exogenous loading demands will be taken to be represented by the sum of consecutive throws of a three sided die
with faces numbered {-1, 0, 1}, constituting a discrete random walk process. The probability mass function for $w$
at time-index $k$ is:

$$\Pr\left(w(k) = w \mid w(k-1) = v\right) = \begin{cases} \frac{1}{3} & w - v \in \{-1, 0, 1\} \\ 0 & \text{else} \end{cases}, \quad k \in [a+1, \infty) \tag{11}$$

where the variable $a$ represents the time-index at which prognostic predictions are made.
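
The predicted distribution of the exogenous load can be obtained by repeatedly applying the one-step kernel of Eq. (11) to the distribution at the previous time-index, summing over all previous values. The following sketch does this with exact rational arithmetic; the function names and the choice of a perfectly observed starting value are assumptions made for illustration.

```python
from fractions import Fraction

# One throw of the three sided die with faces {-1, 0, 1}.
STEP_PMF = {-1: Fraction(1, 3), 0: Fraction(1, 3), 1: Fraction(1, 3)}

def propagate(pmf):
    """Advance the distribution of w by one time-index using the kernel of Eq. (11)."""
    out = {}
    for v, p_v in pmf.items():
        for step, p_step in STEP_PMF.items():
            out[v + step] = out.get(v + step, Fraction(0)) + p_v * p_step
    return out

def predict_w(w_observed, steps):
    """Distribution of w after `steps` throws, starting from an exact observation."""
    pmf = {w_observed: Fraction(1)}
    for _ in range(steps):
        pmf = propagate(pmf)
    return pmf

pmf = predict_w(w_observed=0, steps=2)
variance = sum(p * w * w for w, p in pmf.items())  # the mean is zero by symmetry
```

Each throw adds a variance of 2/3, so the spread of the predicted load grows linearly with the prediction horizon, which is one reason long-horizon predictions in this example become highly uncertain.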

Process noise in the fault growth model will be taken to be represented by independent throws of a six sided die
with faces numbered {.7, .8, .9, 1.1, 1.2, 1.3}. The probability mass function for $\xi$ at time-index $k$ is:

$$\Pr\left(\xi(k) = \xi\right) = \begin{cases} \frac{1}{6} & \xi \in \{.7, .8, .9, 1.1, 1.2, 1.3\} \\ 0 & \text{else} \end{cases} \tag{12}$$


Both $w$ and $\gamma$ are assumed to be observable at all time-instants occurring before $a$, thus the fault growth
process may be represented by deterministic equations for time-indexes less than $a$.

Figure 1: Time-series plots for 100 simulations of $w$ and $\xi$ (top) and the product of $w$ and $\xi$ (bottom).


A domain of feasible values for the control degree of freedom $\rho$ is defined as $\rho \in \{.2, .3, 1\}$. The
time-series behavior of the dice throwing system was simulated using a pseudo random number generator initialized
with a unique 'seed' value in each simulation. Figure 1 shows time-series profiles for $|w|$, $\xi$, and
$w \cdot \xi$ that were generated using a pseudo random number generator initialized with 100 unique seeds.
Figure 2 shows simulation results for two sample control policies; one control policy commands no modifications
to the system's nominal load allocations and the other control policy commands the maximum allowable degradation
of the system's nominal load allocations. Note that the modeling uncertainties included in these simulations
become relatively large over the time-interval of interest, as is typical for prognostics-based control
applications. Also note that the risk of component failure within the time frame of interest is seen to be
relatively large for the case that the system's nominal loading performance is not at all degraded, and the risk
of component failure is seen to be essentially nonexistent for the case that the system's loading performance is
degraded to the maximum allowable extent.
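
A seeded simulation of the kind described above might be sketched as follows. The constant of proportionality `A`, the horizon of 100 steps, the initial SOH `gamma0`, and the initial load `w0` are not specified in the text and are assumed values here; the ordering of the two dice throws within a time-step is likewise an assumption, and no failure threshold is modeled.

```python
import random

XI_FACES = (0.7, 0.8, 0.9, 1.1, 1.2, 1.3)  # six sided die of Eq. (12)
W_STEPS = (-1, 0, 1)                       # three sided die of Eq. (11)

def simulate_soh(rho, n_steps, seed, gamma0=1.0, w0=0, A=0.01):
    """Simulate one SOH trajectory under a constant control value rho.

    gamma0, w0 and A are assumed values chosen for illustration only.
    """
    rng = random.Random(seed)  # one seed per simulation, as in the text
    gamma, w = gamma0, w0
    trajectory = [gamma]
    for _ in range(n_steps):
        xi = rng.choice(XI_FACES)
        gamma -= A * rho * abs(w) * xi   # health lost over one step, Eq. (10)
        w += rng.choice(W_STEPS)         # random walk update of the load
        trajectory.append(gamma)
    return trajectory

# Two sample policies: no derating (rho = 1) and maximum derating (rho = 0.2).
runs_nominal = [simulate_soh(1.0, 100, seed) for seed in range(100)]
runs_derated = [simulate_soh(0.2, 100, seed) for seed in range(100)]
```

Reusing the same seed for both policies exposes them to identical dice throws, so differences between paired trajectories are attributable to the control value alone.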

The box plots shown on the right hand side of Figure 2 provide a convenient means of representing the statistics
of the fault growth process. The bottom and top of the boxes plotted in Figure 2 represent the 25th and 75th
percentiles of the data range at a given time-index, the notch in each box represents the median value of the data
points, the dashed line represents the mean value, and the whiskers in the box plots extend to the most extreme
points falling within the range,

$$q_1 - 1.5 \cdot (q_3 - q_1) < d_i < q_3 + 1.5 \cdot (q_3 - q_1) \tag{13}$$

where $q_1$ and $q_3$ are the 25th and 75th percentiles of the data respectively, and $d_i$ represents a datapoint.

The next section describes the use of finite horizon dynamic programming for identifying control policies over a fixed 
time interval that optimize a given risk metric, where the risk metric quantifies a relative aversion to predictions of 
nominal loading performance degradations and predictions of fault growth over a given prognostic horizon. 


IV. Identifying Optimal Finite Horizon Control Policies With Dynamic Programming 

The prognostics-based control problem can generally be viewed as an optimization problem, in which an implemented
control routine will select values of $\rho$ at each control time-step in an attempt to minimize the risks posed by
the continued application of load to degrading components, while also minimizing deviation from a system's nominal
output loading performance. The stochastic models given in Eqs. (10-12) yield predictions of a system's future
deviation from its nominal output loading performance and predictions of future component health deterioration in
terms of the controlled variable $\rho$.

If aversion to the potential degradation of a system's nominal output loading performance and aversion to the
potential degradation of component SOH are expressed in terms of an expectation of accumulated state transition
costs, then the search for a control policy that optimizes a stochastic system of the form given in Eq. (8) is
expressed as a Markov decision process (MDP). MDPs are commonly used to analyze problems involving decision making
in the presence of uncertain or stochastic information. Optimizing control policies for MDPs may be identified
using well known MDP optimization techniques such as dynamic programming for finite horizon optimization problems,
and linear programming, value iteration, or policy iteration for discounted and average-reward infinite horizon
optimization problems.

Figure 2: Time-series data (left) and box plots (right) for 100 simulations of the dice throwing process with
$\rho = 0.2$, representing the maximum allowable degradation of the system's nominal load allocation (top), and
with $\rho = 1$, representing no degradation of the system's nominal load allocation (bottom).

An MDP formulation of the prognostics-based control problem will optimize a cost function that is defined as:

$$g_N\left(\gamma(N)\right) + \sum_{k=0}^{N-1} g_k\left(\gamma(k),\ \rho(k),\ w(k),\ \xi(k)\right) \tag{14}$$

where state transition costs, denoted by $g_k\left(\gamma(k), \rho(k), w(k), \xi(k)\right)$, assign a cost to the
possibility of transitioning from one system state to another at time-index $k$, and a terminal cost, denoted by
$g_N\left(\gamma(N)\right)$, penalizes the total degradation of component SOH over a simulated time window.

If the values to be taken by the random variables $w$ and $\xi$ were known over the time window $k \in [0, N-1]$,
then Eq. (14) could be optimized directly; however, because future values of $w$ and $\xi$ are unknown in
prognostic applications, the optimization problem is formulated in terms of the expectation of summed state
transition costs,

$$E\left\{ g_N\left(\gamma(N)\right) + \sum_{k=0}^{N-1} g_k\left(\gamma(k),\ \rho(k),\ w(k),\ \xi(k)\right) \right\} \tag{15}$$


A sequence of control actions taken over the domain $k \in [0, N-1]$ is referred to as a control policy, and is
denoted as $\pi = \{\rho_0, \ldots, \rho_{N-1}\}$, where $\rho_k$ maps observations of $\gamma$ and $w$ obtained at
time-index $k$ into a control action,

$$\rho(k) = \rho_k\left(\gamma(k),\ w(k)\right) \tag{16}$$


The expected cost of enacting a particular control policy $\pi$ when starting at given initial values of $\gamma$
and $w$ is:

$$J_\pi\left(\gamma(0), w(0)\right) = E\left\{ g_N\left(\gamma(N)\right) + \sum_{k=0}^{N-1} g_k\left(\gamma(k),\ \rho_k\left(\gamma(k), w(k)\right),\ w(k),\ \xi(k)\right) \right\} \tag{17}$$

where an optimal control policy is defined as one that minimizes this cost over the set $\Pi$ of admissible
policies,

$$J_{\pi^*}\left(\gamma(0), w(0)\right) = \min_{\pi \in \Pi} J_\pi\left(\gamma(0), w(0)\right) \tag{18}$$
Here $\pi^*$ represents an optimal control policy. Optimal control policies may be identified over a finite time
window using the well known dynamic programming algorithm. This algorithm implements a backwards induction
searching strategy to identify an optimal control policy over the time window $k \in \{N-2, N-1\}$, and then for
$k \in \{N-3, N-2, N-1\}$ and so on, until the optimal policy is found over the entire window of interest.

$$
\begin{aligned}
J_N\left(\gamma(N), w(N)\right) &= g_N\left(\gamma(N)\right) \\
\rho_k^*(\gamma, w) &= \arg\min_{\rho} E\left\{ g_k\left(\gamma, \rho, w, \xi(k)\right) + J_{k+1}\left(f\left(\gamma, \rho \cdot w, \xi(k)\right),\ w(k+1)\right) \right\} \\
J_k(\gamma, w) &= E\left\{ g_k\left(\gamma, \rho_k^*(\gamma, w), w, \xi(k)\right) + J_{k+1}\left(f\left(\gamma, \rho_k^*(\gamma, w) \cdot w, \xi(k)\right),\ w(k+1)\right) \right\} \\
&\gamma \in S,\ w \in W,\ \pi^* = \{\rho_0^*, \ldots, \rho_{N-1}^*\},\ k = N-1, \ldots, 1, 0
\end{aligned}
\tag{19}
$$
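
A backward induction of this kind might be sketched as follows for the dice throwing game. This is a minimal illustration under several assumptions that are not taken from the paper: the constant `A`, the horizon, the uniform SOH grid with linear interpolation of the cost-to-go, the clamping of the load to `[-w_max, w_max]`, and the particular stage and terminal costs (a penalty of one minus the control value per step, and a quadratic penalty on lost health at the end of the window).

```python
XI_FACES = (0.7, 0.8, 0.9, 1.1, 1.2, 1.3)  # process noise die
W_STEPS = (-1, 0, 1)                       # random walk die
RHO_SET = (0.2, 0.3, 1.0)                  # feasible control values

def solve_dp(n_steps, w_max, A=0.01, grid=100):
    """Backward induction on a quantized (SOH, load) grid.

    Returns the cost-to-go table at k = 0 and a list of per-step policy tables,
    both keyed by (soh_index, w). A, grid and w_max are assumed values.
    """
    gammas = [i / grid for i in range(grid + 1)]
    w_values = range(-w_max, w_max + 1)

    def interp(J, gamma, w):
        # Linear interpolation of the cost-to-go between neighboring SOH grid points.
        x = min(1.0, max(0.0, gamma)) * grid
        lo = min(int(x), grid - 1)
        frac = x - lo
        return (1.0 - frac) * J[(lo, w)] + frac * J[(lo + 1, w)]

    # Terminal condition: assumed quadratic penalty on lost health.
    J = {(i, w): 100.0 * (1.0 - g) ** 2 for i, g in enumerate(gammas) for w in w_values}
    policy = []
    for _k in reversed(range(n_steps)):
        J_k, rho_k = {}, {}
        for i, g in enumerate(gammas):
            for w in w_values:
                best = None
                for rho in RHO_SET:
                    expected = 0.0
                    for xi in XI_FACES:
                        g_next = g - A * rho * abs(w) * xi
                        for step in W_STEPS:
                            w_next = max(-w_max, min(w_max, w + step))
                            expected += interp(J, g_next, w_next) / 18.0
                    q = (1.0 - rho) + expected  # stage cost plus expected cost-to-go
                    if best is None or q < best[0]:
                        best = (q, rho)
                J_k[(i, w)], rho_k[(i, w)] = best
        J = J_k
        policy.insert(0, rho_k)
    return J, policy

J0, policy = solve_dp(n_steps=5, w_max=6)
```

If `w_max` is at least the initial load magnitude plus the horizon length, the clamp on the load never affects states reachable from the initial condition. Interpolating the cost-to-go, rather than rounding SOH to the nearest grid point, avoids silently discarding damage increments that are smaller than the grid spacing.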


V. Simulation Results 

The dynamic programming algorithm given in Eq. (19) was used to identify an optimizing control policy over a given
finite time window for the dice throwing game defined in Eqs. (10-12). State transition costs for the dice
throwing game were designated to penalize the proportional deviation from the system's nominal output loading
commands at each control time-index,

$$g_k\left(\gamma(k),\ \rho_k\left(\gamma(k), w(k)\right),\ w(k),\ \xi(k)\right) = 1 - \rho_k\left(\gamma(k), w(k)\right) \tag{20}$$

The terminating cost was designated to be proportional to the square of the component SOH lost by the end of the
time-window,

$$g_N\left(\gamma(N)\right) = 100 \left(1 - \gamma(N)\right)^2 \tag{21}$$
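
The expected total cost of a given policy can be estimated by averaging the realized cost of Eqs. (20) and (21) over repeated seeded simulations. The sketch below illustrates this for two constant policies; the constant `A`, the horizon, the number of runs, and the initial conditions are assumed values, so the resulting numbers are not expected to reproduce those reported in this paper.

```python
import random
import statistics

XI_FACES = (0.7, 0.8, 0.9, 1.1, 1.2, 1.3)

def total_cost(policy, n_steps, seed, gamma0=1.0, w0=0, A=0.01):
    """Realized cost of one simulated run; gamma0, w0 and A are assumed values."""
    rng = random.Random(seed)
    gamma, w, cost = gamma0, w0, 0.0
    for k in range(n_steps):
        rho = policy(k, gamma, w)               # control action from observations
        cost += 1.0 - rho                       # stage cost, Eq. (20)
        xi = rng.choice(XI_FACES)
        gamma -= A * rho * abs(w) * xi          # health lost over one step
        w += rng.choice((-1, 0, 1))             # random walk update of the load
    return cost + 100.0 * (1.0 - gamma) ** 2    # terminal cost, Eq. (21)

def estimate(policy, n_steps=100, n_runs=100):
    """Sample mean and standard deviation of the total cost over seeded runs."""
    costs = [total_cost(policy, n_steps, seed) for seed in range(n_runs)]
    return statistics.mean(costs), statistics.stdev(costs)

def nominal(k, gamma, w):
    """No deviation from nominal load allocations."""
    return 1.0

def derated(k, gamma, w):
    """Maximum allowable deviation from nominal load allocations."""
    return 0.2

mean_nominal, std_nominal = estimate(nominal)
mean_derated, std_derated = estimate(derated)
```

A state-dependent policy, such as one produced by dynamic programming, can be evaluated with the same routine by passing a function that looks up the control value for the observed SOH and load.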

Figure 3a shows the distribution of component SOH observed at several time-increments over 100 repeated
simulations of the control policy that optimizes the expected value of the cost function defined in Eqs. (20) and
(21). For comparative purposes, the profiles observed for the system's random variables in each simulation were
fed into a non-causal implementation of dynamic programming that solved for the optimal control policy given
perfect information for the future values to be taken by the system's random variables. Figure 3b shows the
distribution of component SOH profiles observed over repeated simulations of the optimal control policy using
perfect future knowledge. The mean and standard deviation of the total costs evaluated over repeated simulations
of the two optimized control policies and the two example control policies discussed in Section III are given in
Table 1.

The results given in Table 1 show that the performance of the control policy identified by dynamic programming
is clearly superior to the two sample control policies discussed in Section III. Comparison of the control policy
computed using a stochastic model of the system with the control policy computed using perfect future knowledge
shows a somewhat more conservative behavior early in the mission from the control policy lacking perfect future
knowledge, which is to be expected. Also note that, even though the uncertainty included in the dynamics of the
dice throwing game is seen to be substantial, relatively little variation is observed over repeated simulations of
fault growth dynamics when the computed optimal control policy is used, indicating good disturbance rejection in
the controlled system. The formulation and implementation of prognostics-based control on the dice throwing
example is intended to be structurally analogous to the prognostics-based control problem on a wide variety of
real-world applications, and future work will demonstrate the adaptation of the notational tools developed in this
publication to more complex and practical problems.



Control policy                                                                  avg($J_\pi$)   std($J_\pi$)
Control policy enacting no deviation from nominal load allocations              81.6           26.7
Control policy enacting maximum deviation from nominal load allocations         79.6           10.5
Optimal control policy computed using stochastic modeling                       65.7           21.1
Control policies computed using perfect future knowledge of each simulation     56.6           23.6

Table 1: Statistics of the total costs computed over repeated simulations of four simulated control policies

Figure 3: Plots of component SOH over repeated simulations of the optimal control policy computed with stochastic
models (a) and plots of component SOH using an optimal control policy computed for the particular evolution of
random variables observed in each simulation of the dice throwing game (b).


VI. Conclusions 

A contrived example of a dice throwing game was considered in order to provide some insight into the general
problem of developing prognostics-based control routines that utilize uncertain models of component fault dynamics
and future environmental stresses to assess and mitigate risk. A generalized Markov modeling representation of
fault dynamics was developed for the case that available modeling of fault growth physics and available modeling
of future environmental stresses may be represented by two independent Markov process models. A finite horizon
dynamic programming algorithm was given for a generalized Markov decision process representation of the
prognostics-based control problem and was used to identify an optimal control policy for the dice throwing game
that is considered in this paper. The outcomes obtained from simulations of the optimizing control policy were
observed to differ only slightly from the outcomes that would have been achieved if all modeling uncertainties
were removed from the example dice throwing game (mean total costs of 65.7 and 56.6 respectively, as reported in
Table 1).